Twin Model G-PLDA for Duration Mismatch Compensation in Text-Independent Speaker Verification
Abstract
Short-duration speaker verification is a challenging problem, partly due to utterance duration mismatch. This paper proposes a novel method that modifies the standard Gaussian probabilistic linear discriminant analysis (G-PLDA) to use two separate, jointly trained generative models: one for i-vectors from long utterances and one for i-vectors from short utterances. The proposed twin model G-PLDA employs distinct models for i-vectors of different durations from the same speaker, but the two models share the same latent variables. Unlike the standard G-PLDA, the twin model G-PLDA takes the differences between utterances of varying durations into account. Hyper-parameter estimation and scoring formulae for the twin model G-PLDA are presented. Experimental results obtained using NIST 2010 data show that the proposed technique leads to relative improvements of 8.5% and 15.6% when tested on utterances of 5-second and 3-second durations, respectively.
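As a rough illustration of the idea in the abstract, the standard simplified G-PLDA model and one plausible form of a twin model can be written as below. The notation (mean m, speaker subspace V, latent speaker variable y_i, residual covariance Σ, and the labels ℓ and s for long and short utterances) is assumed here for illustration only; the paper's own parameterisation, estimation, and scoring formulae may differ.

```latex
% Standard simplified G-PLDA: j-th i-vector of speaker i
\mathbf{w}_{ij} = \mathbf{m} + \mathbf{V}\mathbf{y}_i + \boldsymbol{\epsilon}_{ij},
\qquad \mathbf{y}_i \sim \mathcal{N}(\mathbf{0}, \mathbf{I}),
\qquad \boldsymbol{\epsilon}_{ij} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})

% Assumed twin-model form: separate parameters per duration class,
% but the same latent speaker variable y_i in both equations
\mathbf{w}^{(\ell)}_{ij} = \mathbf{m}_{\ell} + \mathbf{V}_{\ell}\mathbf{y}_i + \boldsymbol{\epsilon}^{(\ell)}_{ij},
\qquad \boldsymbol{\epsilon}^{(\ell)}_{ij} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_{\ell})

\mathbf{w}^{(s)}_{ij} = \mathbf{m}_{s} + \mathbf{V}_{s}\mathbf{y}_i + \boldsymbol{\epsilon}^{(s)}_{ij},
\qquad \boldsymbol{\epsilon}^{(s)}_{ij} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_{s})
```

Under this assumed form, sharing y_i is what ties long and short i-vectors of the same speaker together, while the duration-specific parameters absorb the systematic differences between the two duration classes.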
Similar resources
CNN-Based Joint Mapping of Short and Long Utterance i-Vectors for Speaker Verification Using Short Utterances
Text-independent speaker recognition using short utterances is a highly challenging task due to the large variation and content mismatch between short utterances. I-vector and probabilistic linear discriminant analysis (PLDA) based systems have become the standard in speaker verification applications, but they are less effective with short utterances. To address this issue, we propose a novel m...
Modified-prior PLDA and score calibration for duration mismatch compensation in speaker recognition system
To deal with the performance degradation of speaker recognition due to duration mismatch between enrollment and test utterances, a novel strategy to modify the standard normal prior distribution of the i-vector during probabilistic linear discriminant analysis (PLDA) modeling is employed. This new modified-prior PLDA model incorporates the covariance matrix scaled with duration of each utteranc...
Domain adaptation based Speaker Recognition on Short Utterances
This paper explores how in-domain and out-domain probabilistic linear discriminant analysis (PLDA) speaker verification systems behave when enrolment and verification lengths are reduced. Experimental studies have found that when full-length utterances are used for evaluation, the in-domain PLDA approach shows more than 28% improvement in EER and DCF values over the out-domain PLDA approach and when short utterances a...
Duration Mismatch Compensation Using Four-Covariance Model and Deep Neural Network for Speaker Verification
Duration mismatch between enrollment and test utterances remains a major concern for the reliability of real-life speaker recognition applications. Two approaches are proposed here to deal with this case when using the i-vector representation. The first one is an adaptation of Gaussian Probabilistic Linear Discriminant Analysis (PLDA) modeling, which can be extended to the case of any shift b...
Duration dependent covariance regularization in PLDA modeling for speaker verification
In this paper, we present a covariance-regularized probabilistic linear discriminant analysis (CR-PLDA) model for text-independent speaker verification. In the conventional simplified PLDA modeling, the covariance matrix used to capture the residual energies is globally shared for all i-vectors. However, we believe that the point-estimated i-vectors from longer speech utterances may be more acc...